Abstracts: NeurIPS 2024 with Weizhu Chen

Update: 2024-12-06

Description

Next-token prediction trains a language model on all tokens in a sequence. VP Weizhu Chen discusses his team’s 2024 NeurIPS paper on how distinguishing between useful and “noisy” tokens in pretraining can improve token efficiency and model performance.

Read the paper

Get the code
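The core idea in the description — training on the useful tokens in a sequence rather than all of them — can be illustrated with a minimal sketch. The selection criterion below (keeping tokens whose training loss most exceeds a reference model's loss) and all function names are illustrative assumptions, not the paper's exact method; see the linked paper and code for the authors' implementation.

```python
# Hypothetical sketch of selective token training. Instead of averaging the
# next-token loss over every position, keep only a fraction of tokens judged
# most useful. The criterion here (excess loss relative to a reference model)
# is an assumption for illustration.

def select_tokens(train_losses, ref_losses, keep_ratio=0.5):
    """Return indices of tokens to train on: those where the training
    model's per-token loss most exceeds the reference model's loss."""
    excess = [t - r for t, r in zip(train_losses, ref_losses)]
    k = max(1, int(len(excess) * keep_ratio))
    # Rank positions by excess loss, descending; keep the top-k.
    ranked = sorted(range(len(excess)), key=lambda i: excess[i], reverse=True)
    return sorted(ranked[:k])

def selective_loss(train_losses, ref_losses, keep_ratio=0.5):
    """Average the training loss over only the selected tokens."""
    idx = select_tokens(train_losses, ref_losses, keep_ratio)
    return sum(train_losses[i] for i in idx) / len(idx)
```

With per-token losses `[3.0, 0.5, 2.0, 0.1]` and reference losses `[1.0, 0.4, 0.5, 0.1]`, the two highest-excess positions (0 and 2) are kept, so the model spends its updates on the tokens it can still learn from rather than on already-mastered or noisy ones.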

